Curriculum Learning



926ffc0ca56636b9e73c565cf994ea5a-AuthorFeedback.pdf

Neural Information Processing Systems

We thank the reviewers for their valuable comments. We are glad that the reviewers noted our paper as novel (R1). "Decouple the effect of capacity increase and curriculum learning": [...]. We will also move the related-works section as suggested. We agree that this issue is important in the field of curriculum learning. "It could be interesting to show results on the large WebVision benchmark": beyond ImageNet, we conducted new experiments on the WebVision dataset (2.3 million training images) and obtained significant gains; please see the first table above. "Would the proposed curriculum change robustness to adversarial attacks": [...]. On average, our method requires 20% fewer epochs.







2cfa8f9e50e0f510ede9d12338a5f564-AuthorFeedback.pdf

Neural Information Processing Systems

We thank the reviewers for their feedback. Reviewers noted that our "formulation is generic and task-agnostic and therefore has the potential [...]", that "the model simplifies existing work" (R1) and "has been applied to many loss functions and tasks without any change", and that "the experiments cover different tasks and benchmark datasets" (R3). "It is misleading to claim that the paper is the first work using task-agnostic weights that do not require iterative [...]": we do not make such a claim. We believe a simple and easy-to-use idea has potential for great impact. We review prior approaches (in Section 2.1 and Section 1 of the supplementary) [...]. We therefore propose in Section 2.2 [...] (Section 2.3); (2) handle both positive- and negative-valued losses (which justifies the squared log regularizer term). "Does not bring notably new criteria in determining the sample weights" (R3.3): [...]. "SuperLoss does not show an advantage on clean data" (R3.4): [...].
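The squared log regularizer mentioned in this feedback is the defining ingredient of SuperLoss-style confidence-aware losses. A minimal sketch of how such a loss could look, assuming the commonly reported closed form via the Lambert W function; the names `super_loss`, `tau`, and `lam` are illustrative, not the authors' code:

```python
import math

def lambert_w(x, iters=60):
    """Principal branch of the Lambert W function (w * e^w = x), for x >= -1/e.

    Solved by Newton's method; a small epsilon guards the near-zero derivative.
    """
    w = x if x < 1.0 else math.log(x)  # rough initial guess
    for _ in range(iters):
        ew = math.exp(w)
        w -= (w * ew - x) / (ew * (w + 1.0) + 1e-12)
    return w

def super_loss(loss, tau=0.0, lam=1.0):
    """Confidence-aware loss: min over sigma of (loss - tau)*sigma + lam*(log sigma)^2.

    tau is a threshold separating easy from hard samples, lam the regularizer
    strength; both names are illustrative. Returns (weighted loss, sigma).
    """
    beta = (loss - tau) / lam
    # Clipping at -2/e keeps the Lambert W argument in its real domain,
    # which is what lets the loss accept negative-valued inputs.
    sigma = math.exp(-lambert_w(max(beta, -2.0 / math.e) / 2.0))
    return (loss - tau) * sigma + lam * math.log(sigma) ** 2, sigma
```

Easy samples (loss below `tau`) get a confidence above 1 and are up-weighted, while hard or noisy samples are down-weighted, without any task-specific machinery.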


Curriculum Learning With Infant Egocentric Videos

Neural Information Processing Systems

Infants possess a remarkable ability to rapidly learn and process visual inputs. As an infant's mobility increases, so does the variety and dynamics of their visual inputs. Is this change in the properties of the visual inputs beneficial or even critical for the proper development of the visual system? To address this question, we used video recordings from infants wearing head-mounted cameras to train a variety of self-supervised learning models. Critically, we separated the infant data by age group and evaluated the importance of training with a curriculum aligned with developmental order. We found that initiating learning with the data from the youngest age group provided the strongest learning signal and led to the best learning outcomes in terms of downstream task performance. We then showed that the benefits of the data from the youngest age group are due to the slowness and simplicity of the visual experience. The results provide strong empirical evidence for the importance of the properties of the early infant experience and developmental progression in training. More broadly, our approach and findings take a noteworthy step towards reverse engineering the learning mechanisms in newborn brains using image-computable models from artificial intelligence.
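The age-ordered training protocol described in the abstract can be sketched as a stage-wise curriculum. This is an illustrative skeleton, not the authors' training code; `train_stage` stands in for a full self-supervised training run on one age group's videos:

```python
def developmental_curriculum(stages, train_stage, model):
    """Train sequentially on age groups, youngest first.

    stages: dict mapping age (e.g. in months) to that age group's video data.
    train_stage: callable running one self-supervised training phase and
    returning the updated model. Both names are hypothetical.
    """
    for age, data in sorted(stages.items()):  # youngest age group first
        model = train_stage(model, data, age)
    return model
```

Iterating over `sorted(..., reverse=True)` instead would give the oldest-first control condition that such a curriculum is compared against.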


Curriculum Learning for Graph Neural Networks: Which Edges Should We Learn First

Neural Information Processing Systems

Graph Neural Networks (GNNs) have achieved great success in representing data with dependencies by recursively propagating and aggregating messages along the edges. However, edges in real-world graphs often have varying degrees of difficulty, and some edges may even be noisy to the downstream tasks. Therefore, existing GNNs may lead to suboptimal learned representations because they usually treat every edge in the graph equally. On the other hand, Curriculum Learning (CL), which mimics the human learning principle of learning data samples in a meaningful order, has been shown to be effective in improving the generalization ability and robustness of representation learners by gradually proceeding from easy to more difficult samples during training. Unfortunately, existing CL strategies are designed for independent data samples and cannot trivially generalize to handle data dependencies. To address these issues, we propose a novel CL strategy to gradually incorporate more edges into training according to their difficulty from easy to hard, where the degree of difficulty is measured by how well the edges are expected given the model training status. We demonstrate the strength of our proposed method in improving the generalization ability and robustness of learned representations through extensive experiments on nine synthetic datasets and nine real-world datasets.
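The easy-to-hard edge schedule described in the abstract can be sketched with two pieces: a per-edge difficulty score and a pacing function controlling how many edges are visible at each training step. This is an illustrative sketch under assumptions: the linear `pacing` stands in for whatever schedule is actually used, and `difficulty` is assumed to come from the model (e.g. how unexpected an edge is under the current embeddings):

```python
def select_edges(edges, difficulty, frac):
    """Keep the easiest fraction `frac` of edges for the current step.

    edges: list of (u, v) pairs; difficulty: per-edge scores, lower = easier.
    """
    k = max(1, int(frac * len(edges)))
    order = sorted(range(len(edges)), key=lambda i: difficulty[i])  # easy first
    return [edges[i] for i in order[:k]]

def pacing(step, total_steps, start=0.3):
    """Linear pacing: the visible edge fraction grows from `start` to 1."""
    return min(1.0, start + (1.0 - start) * step / total_steps)
```

At each step the GNN would then propagate messages only along `select_edges(edges, difficulty, pacing(step, total_steps))`, so noisy or surprising edges enter training late or, if pacing never reaches them, not at all.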